Minimizing the disclosure risk of semantic correlations in document sanitization

Authors

  • David Sánchez
  • Montserrat Batet
  • Alexandre Viejo
Abstract

0020-0255/$ see front matter. © 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ins.2013.06.042

Corresponding author: D. Sánchez. Address: Departament d’Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Av. Països Catalans, 2 Tarragona, Spain. Tel.: +34 977 559657; fax: +34 977 559710. E-mail address: [email protected] (D. Sánchez).

Similar resources

Utility-preserving sanitization of semantically correlated terms in textual documents

Traditionally, redaction has been the method chosen to mitigate the privacy issues related to the declassification of textual documents containing sensitive data. This process is based on removing sensitive words in the documents prior to their release and has the undesired side effect of severely reducing the utility of the content. Document sanitization is a recent alternative to redaction, w...

Data sanitization in association rule mining based on impact factor

Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates the concerns of individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns and so data confidentiality is preserved ag...

Detecting Term Relationships to Improve Textual Document Sanitization

Nowadays, the publication of textual documents provides critical benefits to scientific research and business scenarios where information analysis plays an essential role. Nevertheless, the possible existence of identifying or confidential data in this kind of document motivates the use of measures to sanitize sensitive information before publication, while keeping the innocuous data unmod...

Document Sanitization: Measuring Search Engine Information Loss and Risk of Disclosure for the Wikileaks cables

In this paper we evaluate the effect of a document sanitization process on a set of information retrieval metrics, in order to measure information loss and risk of disclosure. As an example document set, we use a subset of the Wikileaks Cables, made up of documents relating to five key news items which were revealed by the cables. In order to sanitize the documents we have developed a semi-auto...

Automatic Declassification of Textual Documents by Generalizing Sensitive Terms

With the advent of the Internet, large numbers of text documents are published and shared every day. Each of these documents contains a vast amount of information. Publicly sharing some of this information may compromise privacy if the information is confidential. So before document publishing, sanitization operations are performed on the document for preserving the...

Journal title:
  • Inf. Sci.

Volume 249

Publication date: 2013